\chapter{Summary and Conclusions}

How best to quantify comparisons between model predictions and experiments is not obvious. The necessary and perceived level of agreement for any variable depends on the typical use of the variable in a given simulation, the nature of the experiment, and the context of the comparison relative to other comparisons being made. For instance, a user may be interested in the time it takes to reach a certain temperature in the room, but have little or no interest in the peak temperature for experiments that quickly reach a steady-state value. Insufficient experimental data and an incomplete understanding of how to compare the numerous variables in a complex fire model prevent a complete validation of the model.

\section{Summary of CFAST Model Uncertainty Statistics}

A true validation of a model would involve proper statistical treatment of all the inputs and outputs of the model, with appropriate experimental data to allow comparisons over the full range of the model. Thus, the comparisons between model predictions and experimental data discussed here are intentionally simple, and they vary from test to test and from variable to variable due to the changing nature of the tests and the typical use of different variables. The following table lists, for each quantity of interest examined in this Guide, the bias and relative standard deviation of the predicted values, along with the total number of experimental data sets on which these statistics are based and the total number of point-to-point comparisons. Obviously, the more data sets and the more points, the more reliable the statistics.

For further details about model uncertainty and the meaning of these statistics, see Refs.~\cite{FDS_Validation_Guide_6, NRCNUREG1824}.
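The bias and relative standard deviation in the table are derived from the distribution of $\ln(M/E)$ discussed in Section~\ref{normality_tests}. The following Python sketch illustrates the basic idea under a simplifying assumption: it estimates the bias simply as the exponential of the mean of $\ln(M/E)$, and it omits the correction for experimental uncertainty described in the cited references. The function name is illustrative, not part of any CFAST tool.

```python
import math

def bias_and_relative_sd(predictions, measurements):
    """Simplified sketch: mean and sample standard deviation of ln(M/E).

    NOTE: the statistics reported in the table also account for
    experimental uncertainty (see the cited references); here the
    bias is estimated simply as exp(mean of ln(M/E)).
    """
    logs = [math.log(m / e) for m, e in zip(predictions, measurements)]
    n = len(logs)
    mu = sum(logs) / n
    # Sample standard deviation (n - 1 divisor).
    sigma = math.sqrt(sum((x - mu) ** 2 for x in logs) / (n - 1))
    return math.exp(mu), sigma
```

A bias slightly above 1 indicates that, on average, the model over-predicts the measured quantity.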

\begin{table}

\vspace{0.1in}

\IfFileExists{SCRIPT_FIGURES/ScatterPlots/validation_statistics.tex}{
\input{SCRIPT_FIGURES/ScatterPlots/validation_statistics.tex}}{\typeout{Error: Missing file SCRIPT_FIGURES/ScatterPlots/validation_statistics.tex}}
\end{table}

CFAST predictions in this validation study were consistent with numerous earlier studies, which show that the use of the model is appropriate over a range of conditions for a variety of fire scenarios. The CFAST model has been subjected to extensive evaluation studies by NIST and others (see, for example, Ref.~\cite{NRCNUREG1824} and Chapter~\ref{Survey_Chapter}). Although differences between the model and the experiments were evident in these studies, most can be explained by limitations of the model and the experiments. As with all predictive models, the best predictions are made with an understanding of the limitations of the model and of the inputs provided to perform the calculations.

\clearpage

\section{Normality Tests}
\label{normality_tests}

The histograms on the following pages display the distribution of the quantity $\ln(M/E)$, where $M$ is a random variable representing the \underline{M}odel prediction and $E$ is a random variable representing the \underline{E}xperimental measurement. From the development of the statistics used to compare model and experimental values~\cite{FDS_Validation_Guide_6}, $\ln(M/E)$ is assumed to be normally distributed. To test this assumption for each of the quantities of interest listed in Table~\ref{summary_stats}, Spiegelhalter's normality test has been applied~\cite{Spiegelhalter:Biometrika1983}. This test examines a set of values, $x_1, \ldots, x_n$, whose mean and standard deviation are computed as follows:
\be
   \bar{x} = \frac{1}{n} \sum_{i=1}^n x_i  \quad ; \quad \sigma^2 = \frac{1}{n-1}  \sum_{i=1}^n \left( x_i - \bar{x} \right)^2
\ee
Spiegelhalter tests the null hypothesis that the sample $x_i$ is taken from a normally distributed population. The test statistic, $\rm S$, is defined:
\be
   {\rm S} = \frac{N-0.73 \, n}{0.9 \, \sqrt{n}}  \quad ; \quad N=\sum_{i=1}^n Z_i^2 \, \ln \, Z_i^2  \quad ; \quad Z_i = \frac{x_i - \bar{x}}{\sigma}
\ee
Under the null hypothesis, the test statistic is normally distributed with mean 0 and standard deviation of 1. If the $p$-value
\be
   p = 1 - \left| \erf \left( \frac{{\rm S}}{\sqrt{2}} \right) \right|
\ee
is less than 0.05, the null hypothesis is rejected.
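The test can be sketched in a few lines of Python. This is a minimal illustration of the formulas above, not part of any CFAST tooling; the function name is our own, and a guard is added for the case $Z_i = 0$, since $z^2 \ln z^2 \to 0$ as $z \to 0$.

```python
import math

def spiegelhalter_test(samples):
    """Spiegelhalter's normality test as described in the text.

    Returns (S, p). Under the null hypothesis that the samples come
    from a normal population, S ~ N(0, 1); reject if p < 0.05.
    """
    n = len(samples)
    xbar = sum(samples) / n
    # Sample standard deviation with the n - 1 divisor, as in the text.
    sigma = math.sqrt(sum((x - xbar) ** 2 for x in samples) / (n - 1))
    # N = sum of Z_i^2 ln Z_i^2; terms with Z_i = 0 contribute nothing
    # because z^2 ln z^2 -> 0 as z -> 0.
    big_n = 0.0
    for x in samples:
        z2 = ((x - xbar) / sigma) ** 2
        if z2 > 0.0:
            big_n += z2 * math.log(z2)
    s = (big_n - 0.73 * n) / (0.9 * math.sqrt(n))
    p = 1.0 - abs(math.erf(s / math.sqrt(2.0)))
    return s, p
```

For the sample $(1, 2, 3)$, for example, the middle value coincides with the mean, so its term drops out of $N$ entirely.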

The flaw in most normality tests is that they tend to reject the assumption of normality when the number of samples is relatively large. As can be seen in the histograms on the following pages, some fairly ``normal''-looking distributions fail the test, while decidedly non-normal distributions pass. For this reason, the $p$-value is less important than the qualitative appearance of the histogram. A best-fit Gaussian curve is also shown in each figure. If the histogram exhibits the typical bell-shaped curve, this adds confidence to the statistical treatment of the data; if it does not, this might cast doubt on the statistical treatment for that particular quantity.

\IfFileExists{SCRIPT_FIGURES/ScatterPlots/validation_histograms.tex}{\input{SCRIPT_FIGURES/ScatterPlots/validation_histograms.tex}}{\typeout{Error: Missing file SCRIPT_FIGURES/ScatterPlots/validation_histograms.tex}}


